Abstract

Decomposition of any Boolean function BF_n of n binary inputs into an optimal
inverter-coupled network of symmetric Boolean functions SF_k (k < n) is described. Each
SF component is implemented by threshold logic cells, forming a complete and compact
T-cell library. Optimal phase assignment of input polarities maximizes local symmetries.
The rank spectrum is a new BF_n description independent of input ordering, obtained
by mapping its minterms onto an orthogonal n x n grid of (transistor-) switched conductive
paths, minimizing crossings in the silicon plane. Using this orthogrid structure for the
layout of SF_k cells, without mapping to T-cells, yields better area efficiency, exploiting
the maximal logic path sharing in SF's. Results obtained with a CAD tool "Ortolog",
based on these concepts, are reported. Relaxing symmetric to planar Boolean functions
is sketched, to improve the decomposition of low-symmetry BF's.

1 Introduction 

Since the early eighties the synthesis of combinational logic for the design of integrated circuits
(IC's) has been increasingly automated. Present logic synthesis tools, near the bottom of the IC
design hierarchy, just above layout, are fairly mature, being intensively applied in the design
of production IC's. But some problems remain:

A. Logic synthesis tools often have a disturbing order dependence. Re-ordering signals, 
which should not affect the result, can cause a considerable increase or decrease of silicon 
area. To curb computer time, synthesis tools avoid global analysis which tends to grow 
exponentially with the number of inputs. Hence a local approach is preferred, using a greedy 
algorithm, taking the first improvement that comes along. The result then depends on the 
ordering of cubes in a PLA listing, or the input order in a BDD (binary decision diagram)
[1][2][3] representing a Boolean function (BF). This effect is reduced by global analysis, and
by using symmetric function components SF, which are independent of input ordering. CPU time is
reduced by 'arithmetization' via spectral BF_n analysis, a new method of characterizing
BF's, explained below.

B. Optimal polarity or phase assignment of signals, either inputs or intermediate variables,
is still an unsolved problem, although some heuristics are applied. Input phases influence
logic symmetries, which can be exploited for an efficient decomposition, which is essentially synthesis.

C. The use of a standard cell library forces the decomposition and cell mapping stages
to produce a suboptimal gate network, compared with cells compiled as needed [4], which use
no cell library but a programmable grid template, to be discussed. The proposed 'orthogrid' BF
structure is an experiment in that direction, to be extended to planar functions beyond symmetric
functions, as a 'grid template' alternative to FPGA or FPMUX cells [5]. Performance prediction,
which otherwise comes with a cell library, is then done by the cell compiler, which is quite
feasible, replacing library maintenance by compiler support.

D. Complete testing of combinational logic circuits requires irredundancy, guaranteed 
only in sum of cubes 2-level implementation. Logic in factored form, the usual result of 
a synthesis tool, sometimes has testability problems. Restriction to a disjoint product is 
proposed, with factors having no common input. This guarantees the irredundancy needed 
for BF testability in factored form. Moreover, disjoint products yield a spectral calculus, with a
BF rank spectrum independent of input ordering, and a convolution composition rule. 

Order independent Logic Synthesis: 

The problems mentioned above imply that present synthesis CAD guarantees no optimality (nor
full testability), nor does one know how close to or far from the optimum a result is. Presently,
a feeling for the complexity of the functions to be synthesized is obtained only by many
synthesis runs (design space exploration), allowing a trade-off between circuit area, delay, and
power dissipation, however at a high cost in CPU time.

Our aim is to improve this situation, crucial for the future of digital VLSI systems. The 
emphasis is on order-independent function representation, using a spectral technique called
rank spectrum, and on global analysis before synthesis, which then becomes feasible. In fact 
we go one step beyond BDD type of BF descriptions, by mapping minterms as paths in an 
orthogonal grid, using symmetric F's and signal phasing. 

Then methods similar to those applied in signal processing, like the frequency spectrum, or 
convolution of impulse response and input sequence in the time domain, can also be applied 
to Boolean functions. This yields: 

- synthesis by global structure analysis,

- with arithmetization of Boolean algebra,

- via a rank-spectrum technique.

2 Ortho grid, rank spectrum 

Def. Orthogrid plot: map each minterm of a BF_n (as a 0/1 string of length n) onto an
orthogonal grid, as an n-step path from the origin to the n-th diagonal. In input sequence,
step down if '0', and right if '1' (see fig. 1).

This models a pass transistor network on silicon, with a conducting path from the origin to
the n-th diagonal for the given minterm. OR-ing all paths yields function F=1 only if some
path connects the origin to the final diagonal.
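The orthogrid mapping can be illustrated by a minimal Python sketch. The function names `orthogrid_path` and `evaluate`, and the (downs, rights) coordinate convention, are assumptions made for illustration only; they are not taken from the Ortolog tool.

```python
def orthogrid_path(minterm):
    """Return the grid points visited by one minterm, given as a 0/1 string.

    Coordinates are (downs, rights): '0' steps down, '1' steps right.
    """
    down, right = 0, 0
    path = [(down, right)]
    for bit in minterm:
        if bit == "0":
            down += 1
        elif bit == "1":
            right += 1
        else:
            raise ValueError("minterm must contain only '0' and '1'")
        path.append((down, right))
    return path


def evaluate(minterms, inputs):
    """F = 1 iff the path selected by `inputs` is one of the plotted paths."""
    return int(inputs in set(minterms))


# F = (a XOR b)(c XOR d), input order a, b, c, d (the function of fig. 1)
xor_pair = ["0101", "0110", "1001", "1010"]
path = orthogrid_path("0110")
```

Every path of an n-input function has n steps, so its end point (downs, rights) satisfies downs + rights = n, which is the n-th diagonal.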

For n inputs, each path ends on the n-th diagonal. All minterms of equal rank (number of
ones) end in the same point on the n-th diagonal. Without confusion such a minterm set is
also called a rank of F. For the orthogrid plot of a single-rank XOR product (4 terms, rank
2) see fig. 1.

Def: a rank function RF has only one non-empty rank (equal rank minterms).

Def: the BF_n rank spectrum is the vector of path (minterm) counts per rank [0 .. n].
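As a small illustration of this definition, the sketch below computes the rank spectrum of a minterm list and checks, by brute force over all input permutations, that it does not depend on input ordering. The helper names `rank_spectrum` and `reorder` are hypothetical.

```python
from itertools import permutations


def rank_spectrum(minterms, n):
    """Count minterms per rank (number of ones), for ranks 0..n."""
    spectrum = [0] * (n + 1)
    for m in minterms:
        spectrum[m.count("1")] += 1
    return spectrum


def reorder(minterms, order):
    """Apply one input permutation to every minterm."""
    return ["".join(m[i] for i in order) for m in minterms]


# F = (a XOR b)(c XOR d): a rank function with 4 minterms of rank 2
xor_pair = ["0101", "0110", "1001", "1010"]
spectrum = rank_spectrum(xor_pair, 4)

# Permuting inputs moves ones around but never changes their number
order_independent = all(
    rank_spectrum(reorder(xor_pair, p), 4) == spectrum
    for p in permutations(range(4))
)
```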

A BF_n is the sum of its rank functions, and its rank spectrum is independent of input ordering.
In general, crossing paths are not allowed to touch each other, and must be drawn with a bridge
or tunnel. This makes the orthogrid style cumbersome for larger functions, and probably
explains the popularity of the Shannon tree, which can be displayed free of crossings, that
is: as a planar acyclic graph. However, path sharing is essential to recognize common
factors, which is a clue to logic synthesis, showing the power of BDD's and the orthogrid
representation.

Planar node: factoring paths 

Def: a node is planar if all paths connect there (e.g. the circled node in fig. 1).
So all such paths are cut in two parts: each first section from the origin is continued
(multiplied) by all second sections to the final diagonal.

A function F with all paths (minterms) passing through a planar node is a product of two
functions F = G*H sharing no inputs, where G is a rank function; here G(a, b) and H(c, d). A
planar node plays the role of a factor node. Planarization is essential for synthesis, obtained
by proper choice of order and polarity of inputs.
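A hedged sketch of the factoring idea: it finds the grid points shared by all paths of a minterm list, then cuts the minterms at such a point and checks explicitly whether the function equals the product of its first and second sections. The names `planar_nodes` and `factor_at` are illustrative, and the explicit product check is an assumption added here, since in a minterm list (unlike in the conducting network) passing through a common point does not by itself connect every first section to every second section.

```python
from itertools import product


def path_points(minterm):
    """Grid points (downs, rights) visited by one minterm path."""
    down = right = 0
    points = [(0, 0)]
    for bit in minterm:
        if bit == "0":
            down += 1
        else:
            right += 1
        points.append((down, right))
    return points


def planar_nodes(minterms):
    """Grid points that lie on every path of the function."""
    common = set(path_points(minterms[0]))
    for m in minterms[1:]:
        common &= set(path_points(m))
    return sorted(common)


def factor_at(minterms, node):
    """Cut all minterms at `node`; report sections G, H and whether F = G*H."""
    depth = node[0] + node[1]
    g = sorted({m[:depth] for m in minterms})
    h = sorted({m[depth:] for m in minterms})
    is_product = sorted(minterms) == sorted(a + b for a, b in product(g, h))
    return g, h, is_product


xor_pair = ["0101", "0110", "1001", "1010"]   # F = (a XOR b)(c XOR d)
nodes = planar_nodes(xor_pair)
g, h, is_product = factor_at(xor_pair, (1, 1))
```

For the XOR pair product the interior common node is (1, 1), and cutting there recovers G(a, b) = a XOR b and H(c, d) = c XOR d.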
Fig 1. Gridplot of F = XOR pair product. Fig 2. Planarize: permute/invert inputs.

Counting occupied gridpoints (nodes), multiple for non-planar nodes, yields a good criterion
for a logic optimization algorithm (planarization):

Factoring criterion: Permute and invert (phase) inputs to minimize node count N.

Alternatively, the number of links L, counting the transistors, could be minimized. Node
count N dominates over link count for practical technological reasons. A bridge requires two
vias to another metal level, costing more than a transistor, which is simply a polysilicon line
crossing (self-aligned) a diffusion path. Under permutation and inversion of inputs, the factored
form of fig. 1 has the minimal (N, L) = (6, 8) of the three gridplots of F.

This orthogrid representation allows characterization of special types of Boolean functions
such as symmetric, planar and rank functions, to be considered next. Notice that the maximally
2^n minterms are plotted in a square grid of n^2 nodes, by virtue of dense path sharing as partial
factors. Actually a half square suffices, up to diagonal n; the other half plane could be used
for the complement or dual of F (as in CMOS).



3 Symmetric and Threshold BF's 

The well known Pascal Triangle, displayed in orthogonal grid fashion (fig. 3), gives in each
node the number R(i,j) of all paths connecting that node to the origin. This is easily verified
by its generation rule: R(i,j) = R(i-1,j) + R(i,j-1) is the sum of its predecessor node
path counts. Induction yields the path counting rank spectrum.
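The generation rule can be checked with a short sketch. The names `path_counts` and `full_rank_spectrum` are hypothetical; node (i, j) is taken to mean i down-steps and j right-steps, which is an assumed convention.

```python
def path_counts(n):
    """R[i][j] = number of grid paths from the origin to node (i, j), i + j <= n."""
    R = [[0] * (n + 1) for _ in range(n + 1)]
    R[0][0] = 1
    for d in range(1, n + 1):          # fill diagonal by diagonal
        for i in range(d + 1):
            j = d - i
            from_above = R[i - 1][j] if i > 0 else 0
            from_left = R[i][j - 1] if j > 0 else 0
            R[i][j] = from_above + from_left
    return R


def full_rank_spectrum(n):
    """Path counts on diagonal n: rank r ends in node (n - r, r)."""
    R = path_counts(n)
    return [R[n - r][r] for r in range(n + 1)]


spectrum4 = full_rank_spectrum(4)
```

The counts on diagonal n are the binomial coefficients, and they sum to 2^n, the total number of minterms of n inputs.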

The XOR-product function F (fig. 1) is not symmetric in all inputs, but it has two partial
symmetries or input equivalences (permute without changing F), written a = b and c = d.
The Ortolog algorithm (sect. 5) detects and enhances such partial symmetries.

Fig 3. Binomial path-count for full ranks (n = 4: rank spectrum [1 4 6 4 1]).

A rank=2 symmetric function in 4 inputs contains all minterms of rank 2, otherwise it
cannot be an SF: there are (4 choose 2) = 6 minterms; in fact a full rank has a binomial
coefficient number of minterms. Notice in fig. 1 there are two paths missing from a full
rank=2: 0011 and 1100 (see dotted lines), so F is not symmetric.

Symmetric functions 'count'

Def. A symmetric function SF does not change by permuting its inputs.

In other words, a function SF is symmetric in all inputs if it depends only on the number of
1-inputs, and not on their position. Its ranks are either full or empty, so:

A symmetric function SF[R] is determined by the set R ⊆ [0, .., n] of its full ranks.

An n-input function has n+1 ranks, with 2^(n+1) subsets, which is the number of symmetric
functions of n inputs. For instance the parity function is symmetric, active for an odd number
of 1-inputs, so the odd ranks are full, and all even ranks empty: SF[odd].
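The full-or-empty property gives a direct symmetry test on the rank spectrum. The following is a minimal sketch under the assumption that the function is given as a list of distinct minterm strings; `symmetric_rank_set` is a hypothetical name.

```python
from math import comb


def rank_spectrum(minterms, n):
    """Count minterms per rank (number of ones), for ranks 0..n."""
    spectrum = [0] * (n + 1)
    for m in minterms:
        spectrum[m.count("1")] += 1
    return spectrum


def symmetric_rank_set(minterms, n):
    """Return the set R of full ranks if the function is symmetric, else None."""
    spectrum = rank_spectrum(set(minterms), n)   # set(): ignore duplicates
    full = set()
    for r, count in enumerate(spectrum):
        if count == comb(n, r):      # rank r is full
            full.add(r)
        elif count != 0:             # partially filled rank: not symmetric
            return None
    return full


parity3 = ["001", "010", "100", "111"]
xor_pair = ["0101", "0110", "1001", "1010"]   # 4 of the 6 rank-2 minterms
```

Here `symmetric_rank_set(parity3, 3)` gives the odd ranks, while the XOR pair product is rejected because its rank 2 is neither full nor empty.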




Fig 4. OR [1,2,3], AND [3], Full Adder (sum [1,3], carry [2,3]).

Symmetric functions count, which is typical for arithmetic. The well known OR function of n inputs
is symmetric, written SF_n[>0]: at least one high input, so only rank 0 is empty. The n-input
AND function is SF_n[n], active only if all n inputs are high, so only rank n is full (containing
just one minterm). And in a 3-input full adder (FA): sum s=1 when 1 or 3 inputs are high,
so ranks [1,3] are full, written s = SF[1,3], while the carry c=1 when 2 or 3 inputs are high,
so c = SF[2,3].
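These examples can be reproduced by a small sketch in which a symmetric function is built directly from its set of full ranks. The constructor name `sf` is an assumption for illustration.

```python
from itertools import product


def sf(ranks):
    """Build the symmetric function SF[R]: 1 iff the number of high inputs is in R."""
    ranks = frozenset(ranks)

    def f(*inputs):
        return int(sum(inputs) in ranks)

    return f


or3 = sf({1, 2, 3})       # OR of 3 inputs: only rank 0 is empty
and3 = sf({3})            # AND of 3 inputs: only rank 3 is full
fa_sum = sf({1, 3})       # full-adder sum
fa_carry = sf({2, 3})     # full-adder carry

# Arithmetic check: a + b + c = 2 * carry + sum for all 8 input combinations
adder_ok = all(
    a + b + c == 2 * fa_carry(a, b, c) + fa_sum(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)
```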

Most BF however are not symmetric in all inputs, although many have partial symmetries 
(in some inputs). A factored function F cannot be symmetric, since inputs to different factors 
are not equivalent. So an SF has no factor, explaining why most logic synthesis tools, based 
on factoring, have trouble with efficient decomposition. 

This suggests putting SF's in the cell library, with 2^k SF_k cells of k inputs, halving the
number of cells by using an inverter to exploit SF(¬X) = ¬SF(X).

T-cell library, threshold logic cells 

Threshold logic functions TF ⊂ SF can implement any SF, in a simple fashion.

Def. A threshold function T_k of n inputs has threshold k ∈ [1, .., n], with T_k = 1 whenever
at least k inputs are active (high).

Any interval of SF full ranks can be implemented by the AND of two threshold
functions: T_i . ¬T_j. So an SF with m full-rank intervals is the sum of m TF pair products.

For instance the full-adder sum output (fig. 4) with interval [1,2] yields: S[1,3] = (T_1 . ¬T_2) + T_3,
using the inverse ¬T_2 of the carry.
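The threshold decomposition can be sketched as follows. The helpers `threshold`, `intervals` and `sf_from_thresholds` are hypothetical names, and thresholds 0 and n+1 are treated here as the constants 1 and 0, which is a convenience assumed for the sketch rather than part of the T-cell library.

```python
from itertools import product


def threshold(k):
    """T_k: 1 iff at least k inputs are high."""
    def t(*inputs):
        return int(sum(inputs) >= k)
    return t


def intervals(ranks):
    """Split a set of full ranks into maximal runs of consecutive ranks."""
    runs = []
    for r in sorted(ranks):
        if runs and runs[-1][1] == r - 1:
            runs[-1][1] = r
        else:
            runs.append([r, r])
    return [tuple(run) for run in runs]


def sf_from_thresholds(ranks):
    """SF[R] as a sum of products T_lo AND NOT T_(hi+1), one per interval."""
    runs = intervals(ranks)

    def f(*inputs):
        for lo, hi in runs:
            at_least_lo = threshold(lo)(*inputs)
            at_most_hi = 1 - threshold(hi + 1)(*inputs)
            if at_least_lo and at_most_hi:
                return 1
        return 0

    return f


# Full-adder sum S[1,3] = (T_1 AND NOT T_2) OR T_3
fa_sum = sf_from_thresholds({1, 3})
sum_ok = all(
    fa_sum(*bits) == int(sum(bits) in {1, 3})
    for bits in product((0, 1), repeat=3)
)
```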

There are just n TF functions of n inputs, with thresholds 1, .., n, forming a compact and
complete T-cell library. Including an inverter, a T-cell library contains sum(1, .., n) =
n(n+1)/2 cells, that is 10 cells if n=4, or 15 cells for n=5. This is less than a complete
S-cell library of 1+(3+7+15)=26 cells (n=4), or 57 cells (n=5), which however will yield
smaller synthesized circuits (see section 6: further research).

4 Planar cut and factoring 

The two basic causes for asymmetry are: factoring and inverse.

The smallest asymmetric functions are: a(b + c), a + bc and a ¬b, a + ¬b.

The first two cases use both (.) and (+), where the role of a essentially differs from b, c, which
are equivalent (permutable). The last two cases are asymmetric in (a, b), but symmetric
in (a, ¬b). In general, input phasing costs little, while making a function more symmetric and
increasing local symmetries (with dense path sharing), essential for logic optimization (fig. 1, 2).

Spectral product, and planar cut: Function F = G(X) H(Y) is a disjoint product if
factors G and H share no inputs, so X ∩ Y is empty. Multiplying the rank spectra sp(G)
and sp(H), as a convolution, yields the spectrum of composition F:

sp(F) = sp(G) * sp(H).

Order input sets X and Y adjacent in the gridplot. Then this spectral product rule follows
since each path in G(X) is continued by (in product with) each path in H(Y), to form all
paths (minterms) of length |X| + |Y| in F. Let |X| = m; then the gridplot of F has diagonal
m consisting of only planar nodes, with corresponding factor property: planar cut (sect. 5,
algorithm step 3). Let G = a ⊕ b and H = c + d + e, with spectra G[0, 2, 0] and H[0, 3, 3, 1];
then the product spectrum is [0, 3, 3, 1] * [0, 2, 0] = [0, 0, 6, 6, 2, 0] by 'longhand' multiplication
(without carry).
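The worked example can be verified numerically. The sketch below implements the 'longhand' product as a plain convolution and compares it with a brute-force rank spectrum of F; the function names are illustrative.

```python
from itertools import product


def convolve(sp_g, sp_h):
    """Multiply two rank spectra 'longhand', without carry."""
    out = [0] * (len(sp_g) + len(sp_h) - 1)
    for i, g in enumerate(sp_g):
        for j, h in enumerate(sp_h):
            out[i + j] += g * h
    return out


def rank_spectrum_of(func, n):
    """Brute-force rank spectrum of an n-input function returning 0/1."""
    spectrum = [0] * (n + 1)
    for bits in product((0, 1), repeat=n):
        if func(*bits):
            spectrum[sum(bits)] += 1
    return spectrum


def G(a, b):
    return a ^ b                 # a XOR b


def H(c, d, e):
    return c | d | e             # c OR d OR e


def F(a, b, c, d, e):
    return G(a, b) & H(c, d, e)  # disjoint product: no shared inputs


sp_g = rank_spectrum_of(G, 2)
sp_h = rank_spectrum_of(H, 3)
sp_f = rank_spectrum_of(F, 5)
```

The convolution of `sp_g` and `sp_h` equals `sp_f`, as the spectral product rule states for disjoint products.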

5 "Ortolog" fast algorithm

The Ortolog algorithm is designed for global yet fast detection of (partial) symmetries,
enhancing them by input phasing. The rank spectrum is a simple and fast symmetry test for
any subfunction, by checking if each rank is full or empty.

The input format is that of a PLA (2-level OR/AND logic), hence a list of cubes as generalized
minterms, each with all n circuit inputs (length-n strings over 1/0/- for input
straight/inverse/independent). The algorithm is double recursive: start with a minimized 2-level logic
BF_n(X) as a list of m cubes, and proceed as follows:

1. Core(a, b): for each input pair (a, b) find the cubes symmetric in a, b.
Maximize each core by choosing input phase ¬a if Core(¬a, b) has more cubes.

2. Input-expand maximal (phased) pair cores to Core(a, b, Y) with inputs c (or ¬c) in rest
input set Y. Stop criterion: max |Core| x |inputs|^2 prefers a wide Core (more inputs) over
a deep Core (more cubes). Select one such 'best' multi-input Core(Z), symmetric for all
inputs in Z ⊆ X. Let Y = ~Z = X - Z.

3. Factorize Core(Z) = Σ_r G_r(Z) * H_r(Y) for ranks r < n, with non-zero symmetric rank
functions G_r(Z) as factors (planar cut).

4. Recursively decompose (1-4) cofactors H_r until all components are symmetric.

5. Recursively decompose (1-5) remainder F(X) - Core(Z), yielding an optimally phased
network of symmetric functions coupled by inverters.
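Step 1 of the list above can be sketched in a few lines. This is only one possible reading, not the Ortolog implementation: a cube is assumed to belong to Core(a, b) when the cube obtained by exchanging its a and b entries is also in the cube list, and the phase of a is inverted only when that gives a strictly larger core. All names (`swap`, `invert`, `core`, `best_pair_cores`) are hypothetical.

```python
from itertools import combinations

FLIP = {"0": "1", "1": "0", "-": "-"}   # a don't-care stays a don't-care


def swap(cube, a, b):
    """Exchange the entries of inputs a and b in one cube."""
    chars = list(cube)
    chars[a], chars[b] = chars[b], chars[a]
    return "".join(chars)


def invert(cube, a):
    """Change the phase of input a in one cube."""
    return cube[:a] + FLIP[cube[a]] + cube[a + 1:]


def core(cubes, a, b):
    """Cubes whose image under exchanging inputs a and b is also in the list."""
    cube_set = set(cubes)
    return [c for c in cubes if swap(c, a, b) in cube_set]


def best_pair_cores(cubes, n):
    """Map each pair (a, b) to (a_inverted, core cubes in the chosen phase)."""
    result = {}
    for a, b in combinations(range(n), 2):
        straight = core(cubes, a, b)
        flipped = core([invert(c, a) for c in cubes], a, b)
        if len(flipped) > len(straight):
            result[(a, b)] = (True, flipped)
        else:
            result[(a, b)] = (False, straight)
    return result


# F = a AND NOT b is asymmetric in (a, b) but symmetric once a is inverted
phased = best_pair_cores(["10"], 2)
# F = (a XOR b)(c XOR d) is already symmetric in (a, b) and in (c, d)
xor_cores = best_pair_cores(["0101", "0110", "1001", "1010"], 4)
```

The double loop over input pairs and cubes is where a cost quadratic in the number of inputs and linear in the number of cubes would arise in such a sketch.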

Speedup option: initially partition F by collecting cubes with equal number of don't-cares
(DC class), since cubes symmetric in the same subset of inputs likely have the same number
of DCs. Decompose the k subfunctions FDC_i separately: F = Σ_i FDC_i.
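The DC-class partition is a simple grouping step. A minimal sketch, assuming cubes are strings over 1/0/- as in the PLA input format, with the hypothetical helper name `partition_by_dc`:

```python
from collections import defaultdict


def partition_by_dc(cubes):
    """Group cubes by their number of don't-care ('-') entries."""
    classes = defaultdict(list)
    for cube in cubes:
        classes[cube.count("-")].append(cube)
    return dict(classes)


dc_classes = partition_by_dc(["1-0", "01-", "111", "--1"])
```

Each class can then be decomposed on its own, and the OR of the resulting subfunctions gives F.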

The SF components can be implemented by T-cells, if a small T-cell library is preferred. 
However, not decomposing the SF cells yields better area efficiency, using their grid plot 
as layout pattern on silicon (grid template), maximally sharing logic paths. 

The algorithm time complexity is O(n^2 m), for a BF_n list of m cubes with n inputs (step
1 is quadratic in n): only quadratic in the number of inputs (not exponential), and linear
in the number of cubes. This allows very fast synthesis of many alternatives in a search
for an optimal binary code at a higher level: error correction codes in Boolean circuit design
[6][7][8] or state-machine logic: FSM state coding [9].

5.1 Experiments 

The described symmetric synthesis with a cell library of 15 T-cells (up to 5 inputs) was
compared with a known tool, Ambit (Cadence), using either a basic library of AND_n/OR_n/INV
(n=2..5) cells, or the usual extensive (full) library of several hundred cells. The logic
density 'dens' is the filling % (non-DC) of the PLA table to be decomposed. Rather than the number
of cells, the total number of cell pitches (#p) is compared in Table 1, as area estimate:



cct       inp   cub   dens     Synthesized #pitches
                      (%)      ---- Ambit ----    Ortolog   Ratio

binom5      6    32     74     (126)      128       148     0.86
cordic     22    27     24     (135)      226       194     1.16
table3     14    52     75     (448)      718       902     0.80
parity      4     8    100     ( 18)       41        48     0.85

Cell Library:                  (Full)     AOI       TC      AOI/TC

Table 1. Synthesis areas (Standard cell # pitches) 

6 Further research 

Extend symmetric to planar functions: The efficiency of decomposing to a network
of symmetric Boolean functions clearly depends on the amount of (local) symmetries in the
initial BF. Table 1 shows that restriction to a library of AND/OR cells (column AOI) or
threshold T-cells (column TC) is too severe: results do not compete with the usual large cell
library. Only for the cordic circuit, which has "much structure", viz. many local symmetries,
do the T-cells outperform the AND/OR library.
Symmetric components SF_k (with dense sharing of logic paths) should not be mapped onto
T-cells, but rather be implemented directly as planar compiled grid cells:

Def: a planar Boolean function PF_n has a planar grid-plot (permute / invert inputs).

Notice that each symmetric SF has only planar nodes in its gridplot, hence is planar. Let
a link be a path of length=1 anywhere in a gridplot. Then any SF_n is the 'template' for a
class of PF_n easily derived from it by removing one or more links. Obviously, any PF_n has
a unique smallest covering SF_n.

The class of PF is much larger than SF, while being easily derived by 'programming' (deleting
links from) the SF's as templates. The number of links in any SF_n is maximally Σ_i 2i =
n(n+1), hence quadratic in n, rather than exponential as in the case of look-up table FPGA's.
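The link count can be illustrated with a small sketch that enumerates all links of the full n-input grid template and counts the minterm paths conducted by a given set of links. The representation of a link as a pair of (downs, rights) grid points, and the names `template_links` and `count_paths`, are assumptions for illustration.

```python
def template_links(n):
    """All links of the full grid template: two per node on diagonals 0..n-1."""
    links = []
    for d in range(n):                     # diagonal of the link's start node
        for right in range(d + 1):
            down = d - right
            links.append(((down, right), (down + 1, right)))   # '0' step
            links.append(((down, right), (down, right + 1)))   # '1' step
    return links


def count_paths(links, n):
    """Number of origin-to-diagonal-n paths (minterms) through the given links."""
    outgoing = {}
    for start, end in links:
        outgoing.setdefault(start, []).append(end)
    counts = {(0, 0): 1}
    for _ in range(n):                     # advance one diagonal per step
        nxt = {}
        for node, c in counts.items():
            for end in outgoing.get(node, []):
                nxt[end] = nxt.get(end, 0) + c
        counts = nxt
    return sum(counts.values())


links3 = template_links(3)
pruned = links3[1:]    # 'program' the template by deleting the first '0' link
```

Diagonal d contributes 2(d+1) links, so the full template has n(n+1) links, while it still carries all 2^n paths; deleting links selects a subset of those paths.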

The number of PF_n, between |SF_n| = 2^(n+1) and |BF_n| = 2^(2^n), requires more research. All
BF_3 are planar, and likely all BF_4 as well, while non-planar BF_n have n > 5.

Conclusions 

The symmetric T-cell library is too restricted to compete with the usually very large cell
libraries, since most BF_n do not have many sizable local symmetries. The area cost of
lacking special cells (e.g. XOR in parity), and of T-cell mapping of SF's, is high.

The Ortolog algorithm performs fast global analysis, including phase assignment, of local
BF_n symmetries. It detects and enhances, by input phasing, the (dense) symmetric parts
of a circuit, for separate symmetric synthesis. The remaining (sparse) asymmetric logic can
be synthesized otherwise. Flexible compiled cell logic synthesis, using the larger class of
planar BF, can derive from symmetric SF_n as programmable n x n grid template.